perm filename CH2.5[206,JMC] blob
sn#070504 filedate 1973-11-06 generic text, type T, neo UTF8
00100 \\M0NGR25;\M1BDJ25;\M2BDR25X;\M3SUB;\M4NGR30;\M5NGR40;\M6SUP;\F0
00200
00300 \F45.Tree search.\F0
00400
00500 \J We begin with a general depth first tree search function.
00600 It can be used to search specific trees of possibilities by defining
00700 three auxiliary functions in a way that depends on the application.
00800 We have\.
00900
01000 \F1search p ← \F2if \F1lose p \F2then \F0LOSE\;
01100 \F2else if \F1ter p \F2then \F1p \F2else \F1searchlis[successors p]\F0
01200
01300 where
01400
01500 \F1searchlis u ← \F2 if n \F1u \F2then \F0LOSE \F2else \F1{search \F2a\;
01600 u}[λx.\F2if \F1x = \F0LOSE \F2then \F1searchlis \F2d \F1u \F2else \F1x].\F0
01700
01800 \JIn the applications, we start with a position \F1p\F30\F0, and we
01900 are looking for a win in the successor tree of \F1p\F30\F0. Certain
02000 positions lose and there is no point looking at their successors.
02100 This is decided by the predicate \F1lose\F0. A position is a win if
02200 it doesn't lose and it satisfies the predicate \F1ter\F0. The
02300 successors of a position is given by the function \F1successors\F0,
02400 and the value of \F1search p\F0 is the winning position. No
02500 non-losing position should have the name LOSE or the function won't
02600 work properly.
02700
02800 Our first example is finding a path from an initial node to
02900 a final node in a graph represented by a list structure as described
03000 in chapter I. A position is a path starting from the initial node
03100 and continuing to some intermediate node and is represented by a
03200 list of its nodes in reverse order. The three functions for this
03300 application are\.
03400
03500 \F1lose p ← \F2a \F1p \F0ε \F2d \F1p,
03600
03700 ter p ← [\F2a \F1p = final],\F0
03800
03900 and
04000
04100 \F1successors p ← mapcar[\F2d \F1assoc[\F2a \F1p,graph],λx.x.p].\F0
04200
04300 \J Another example is the so-called \F1Instant Insanity\F0 puzzle.
04400 There are four cubical blocks, and each face of each block is colored with
04500 one of four colors. The object of the puzzle is to build a tower of all four
04600 blocks such that each vertical face of the tower involves all four colors.
04700 In order to use the above defined function \F1search\F0 for this purpose,
04800 we must define the representation of positions and give the functions
04900 \F1lose, ter, \F0and \F1successors\F0. A position is represented by a list
05000 of lists - one for each face of the tower. Each sublist is the list of colors
05100 of the faces of the blocks showing in that face. We shall assume that the
05200 blocks are described in the following longwinded but convenient way. (We'll
05300 take up precomputing this description later.) For each block there is a list
05400 of the 24 orientations of the block where each orientation is described as
05500 a list of the colors around the vertical faces of the block when it is in that
05600 orientation. Thus the puzzle is described by a list of lists of lists which
05700 we shall call \F1puzz\F0.\.
05800
05900 We now have
06000
06100 \F1p\F30\F0 = (NIL NIL NIL NIL),
06200
06300 \F1lose p ← orlis[p,λu.\F2a \F1u \F0ε \F2d \F1u],
06400
06500 ter p ← [length \F2a \F1p = 4]\F0,
06600
06700 and
06800
06900 \F1successors p ← mapcar[\F2a \F1nth[puzz,1 + length \F2a \F1p]\;
07000 ,λx.mapcar2[p,x,λyz.z.y]],\F0
07100
07200 where
07300
07400 \F1mapcar2[u,v,f] ← \F1if n \F1u \F2then \F0NIL \F2else \;
07500 f[\F2a \F1u,\F2a \F1v] . mapcar2[\F2d \F1u,\F2d \F1v,f].\F0
07600
07700 \J Getting the initial position in the desired form is as complicated
07800 a computation as the actual tree search. It can be
07900 conveniently done by a sequence of assignment
08000 statements starting with a description of the blocks:\.
08100
08200 \F1puzz1 ← \F0((G B B W R G) (G G B G W R) (G W W R B R) (G G R B W W)).
08300
08400 \JHere each block is represented by a list of the colors of the faces starting
08500 with the top face, going around the sides in a clockwise direction and
08600 finishing with the bottom face.
08700
08800 We need to go from this description of the blocks to a list of the
08900 possible cycles of colors on the vertical faces for the 24 orientations of
09000 the block. This not easy, because the order in which we have given the colors
09100 is not invariant under rotations of the block. An easy way out is to start with
09200 a block whose faces are assigned the numbers 1 thru 6 starting with the top,
09300 going clockwise around the sides and finishing with the bottom. We write down
09400 one cycle of side colors for each choice of the face put on top and get the
09500 list of all 24 cycles by appending the results of generating the cyclic
09600 permutations of the cycles. All this is accomplished by the assignment\.
09700
09800
09900 \F1puzz2 ← cycles[\F0(2 3 4 5)]*\F1cycles[(\F0(2 5 4 3)]*\;
10000 \F1cycles[(1 2 6 4)]
10100 *cycles[(1 4 6 2)]*cycles[(1 3 6 5)]*cycles[(1 5 6 3)],\F0
10200
10300 where the function \F1cycles\F0 is defined by
10400
10500 \F1cycles u ← maplist[u,λx.x*upto[u,x]]\F0
10600
10700 with the auxiliary function
10800
10900 \F1upto[u,x] ← \F2if \F1x \F2eq \F1u \F2then \F0NIL \F2else a\;
11000 \F1u . upto[\F2d \F1u,x].\F0
11100
11200 \JNext we create for each block a list of substitutions expressing the
11300 colors of the six numbered faces. We have\.
11400
11500 \F1puzz3 ← mapcar[puzz1,λx.prup[\F0(1 2 3 4 5 6),\F1x]],\F0
11600
11700 \Jand we use these substitutions to get for each block the list of 24 orientations
11800 of the block. Thus\.
11900
12000 \F1puzz4 ← mapcar[puzz3,λs.sublis[s,puzz3]].
12100
12200 \Jpuzz4\F0 has all 24 orientations of the first block while for symmetry reasons
12300 we need only consider three as distinct, say the first, ninth, and seventeen. So
12400 we finally get\.
12500
12600 \F1puzz ← (\F2a \F1nth[\F2a \F1puzz4,1] \F2a \F1nth[\F2a \F1puzz4,9] \;
12700 \F2a \F1nth[\F2a \F1puzz4,17]) . \F2d \F1puzz4.\F0
12800
12900 \JThe program when compiled runs about 11 seconds on the PDP-10.
13000
13100 A more sophisticated representation of face cycles and partial towers
13200 makes a factor of ten speedup without changing the basic search algorithm. A
13300 face cycle is represented by 16 bits in a word four for each face a particular
13400 one of which being turned on tells us the color of the face. If we \F2or\F0
13500 these words together for the blocks in a partial tower we get a word which
13600 tells us for each face of the tower what colors have been used. A
13700 particular face cycle from the next block can be added to the tower if
13800 the \F2and\F0 of its word with the word representing the tower is zero.
13900 We represent a position by a list of words representing its partial
14000 towers with 0 as the last element and the highest partial tower as the
14100 first element. The virtue of this representation is that it makes
14200 the description of the algorithm short.
14300 The initial position is (0). The new \F1puzz\F0 can be formed from the
14400 old one by the assignment\.
14500
14600 \F1puzza ← mapcar[puzz,λx.mapcar[x,zap]],\F0
14700
14800 where
14900
15000 \F1zap v ← \F2if n \F1v \F2then \F10 \F2else \F1poo \F2a \F1v +\;
15100 16zap \F2d \F1v,\F0
15200
15300 and
15400
15500 \F1poo x ← \F2if \F1x=\F0R \F2then \F11 \F2else if \F1x=\F0W \F2then\;
15600 \F02 \F2else if \F1x=\F0G \F2then \F04 \F2else \F08.
15700
15800 \JNow we need the new functions \F1lose, ter, \F0and
15900 \F1successors\F0. These are\.
16000
16100 \F1lose p ← \F2false\F1,
16200
16300 \F1ter p ← [length p = 5],\F0
16400
16500 and
16600
16700 \F1successors p ← mapchoose[\F2a \F1nth[puzza,length p],\;
16800 λx.\F2a \F1p \F2and \F1x = 0,λx. [\F2a \F1p \F2or \F1x] . p].\F0
16900
17000 where
17100
17200 \F1mapchoose[u,pred,fn] ← \F2if n \F1u \F2then \F0NIL
17300 \F2else if \;
17400 \F1pred \F2a \F1u \F2then \F1fn[\F2a \F1u] . mapchoose[\F2d \F1u\;
17500 ,pred,fn] \F2
17600 else \F1mapchoose[\F2d \F1u,pred,fn].\F0
17700
17800 \J\F1lose\F0 is trivial, because the \F1mapchoose\F0 is used to make sure
17900 that only non-losing new positions are generated by \F1successors\F0.
18000 This version runs in a little less than one second on the PDP-10. A greater
18100 speedup can be made by the application of some mathematics. In fact, with
18200 enough mathematics, extensive tree search is unnecessary in this problem.
18300
18400 \F1search\F0 is used when we want to search a tree of possibilities
18500 for a solution to a problem. What if we want to find all solutions
18600 to a problem? This can be done with a function \F1allsol\F0 that uses
18700 the same \F1lose, ter, \F0and \F1successors\F0 functions as does \F1search\F0.
18800 The simplest way to write \F1allsol\F0 is\.
18900
19000 \F1allsol p ← \F2if \F1lose p \F2then \F0NIL \F2else if \F1ter p\;
19100 \F2then \F1(p) \F2else \F1mapapp[successors p,allsol],\F0
19200
19300 where
19400 \F1mapapp[u,fn] ← \F2if n \F1u \F2then \F0NIL \F2else \;
19500 \F1fn[\F2a \F1u] . mappap[\F2d \F1u,fn].\F0
19600
19700 \JThis form of the function is somewhat inefficient because of all the
19800 \F1append\F0ing. A more efficient form uses an auxiliary function as follows:\.
19900
20000 \F1allsol p ← allsola[p,\F0NIL]
20100
20200 \F1allsola[p,found] ← \F2if \F1lose p \;
20300 \F2then \F1found \F2else if \F1ter p \F2then \F1p . found\;
20400 \F2else \F1allsolb[successors p,found],
20500
20600 \F1allsolb[u,found] ← \F2if n \F1u \F2then \F1found \;
20700 \F2else \F1allsolb[\F2d \F1u,allsola[\F2a \F1u,found]].\F0
20800
20900 \JThe recursive program structure that arises here is common when a list
21000 is to be formed by recurring through a list structure.\.
21100
21200
21300
21400 \F46. Game trees.\F0
21500
21600 \J The positions that can be reached from an initial position in
21700 a game form a tree, and deciding what move to make often involves
21800 searching this tree. However, game trees are searched in a different
21900 way than the trees we have looked at, because the opposing interests
22000 of the players makes it not a search for a joint line of play that
22100 will lead to the first player winning, but rather a search for a
22200 strategy that will lead to a win regardless of what the other player
22300 does.
22400
22500 The simplest situation is characterized by a function
22600 \F1successors\F0 that gives the positions that can be reached in one
22700 move, a predicate \F1ter\F0 that tells when a position is to be
22800 regarded as terminal for the given analysis, and a function
22900 \F1imval\F0 that gives a number approximating the value of the
23000 position to one of the players. We shall call this player the
23100 maximizing player and his opponent the minimizing player. Usually,
23200 the numerical values arise, because the search cannot be carried out
23300 to the end of the game, and the analysis stops with reasonably static
23400 positions that can be evaluated by some rule. Naturally, the
23500 function \F1imval\F0 is chosen to be easy to calculate and to
23600 correlate well with the probability that the maximizing player can
23700 win the position.
23800
23900 The simplest rule for finding the correct move in a position
24000 uses auxiliary functions
24100 \F1valmax\F0 and \F1valmin\F0 that give a value to a position
24200 by using \F1imval\F0 if the position is terminal and taking the max
24300 or min of the successor positions otherwise.
24400
24500 For this we want functions for getting the maximum or the
24600 minimum of a function on a list, and they are defined as follows:\.
24700
24800 \F1maxlis[u,f] ← \F2if n \F1u \F2then \F1-∞ \F2else \F1\;
24900 max[f[\F2a \F1u],maxlis[\F2d \F1u,f]],\F0
25000
25100 and
25200
25300 \F1minlis[u,f] ← \F2if n \F1u \F2then \F1∞ \F2else \F1\;
25400 min[f[\F2a \F1u],minlis[\F2d \F1u,f]].\F0
25500
25600 \JIn these functions, -∞ and ∞ represent numbers that are smaller and
25700 larger than any actual values that will occur in evaluating \F1f\F0.
25800 If these numbers are not available, then it is necessary to prime
25900 the function with the values of \F1f\F0 applied to the first element
26000 of the list, and the functions are meaningless for null lists.
26100 Iterative versions of the functions are generally better; we give only
26200 the iterative version of \F1maxlis\F0, namely\.
26300
26400 \F1maxlis[u,f] ← maxlisa[u,-∞,f],\F0
26500
26600 where
26700
26800 \F1maxlisa[u,x,f] ← \F1if n \F1u \F2then \F1x \F2else \;
26900 \F1maxlisa[\F2d \F1u,max[x,f[\F2a \F1u]],f].\F0
27000
27100 We now have
27200
27300 \F1valmax p ← \F2if \F1ter p \F2then \F1imval p \F2else\;
27400 \F1maxlis[successors p,valmin]\F0,
27500
27600 and
27700
27800 \F1valmin p ← \F2if \F1ter p \F2then \F1imval p \F2else\;
27900 \F1minlis[successors p,valmax]\F0.
28000
28100 The best move is now chosen by
28200
28300 \F1movemax p ← bestmax[succesors p,valmin],\F0
28400
28500 or
28600
28700 \F1movemin p ← bestmin[succesors p,valmax],\F0
28800
28900 where
29000
29100 \F1bestmax[u,f] ← bestmaxa[\F2d \F1u,\F2a \F1u,f[\F2a \F1u],f],
29200
29300 \F1bestmaxa[u,sofar,valsofar,f] ← \F2if n \F1u \F2then \F1sofar
29400 \F2else if \F1f[\F2a \F1u] ≤ bestsofar \F2then\;
29500 \F1bestmaxa[\F2d \F1u,sofar,bestsofar,f]
29600 \F2else \F1bestmaxa[\F2d \F1u,\F2a \F1u,f[\F2a \F1u],f],
29700
29800 \F1bestmin[u,f] ← bestmina[\F2d \F1u,\F2a \F1u,f[\F2a \F1u],f],\F0
29900
30000 and
30100
30200 \F1bestmina[u,sofar,valsofar,f] ← \F2if n \F1u \F2then \F1sofar
30300 \F2else if \F1f[\F2a \F1u] ≥ bestsofar \F2then\;
30400 \F1bestmina[\F2d \F1u,sofar,bestsofar,f]
30500 \F2else \F1bestmina[\F2d \F1u,\F2a \F1u,f[\F2a \F1u],f].\F0
30600
30700 \J However, this straight minimax computation of the best move does
30800 much more computation in general than is usually necessary. The simplest
30900 way to see this is from the move tree of Figure 2.6.\.
31000
31100
31200
31300
31400
31500
31600
31700
31800
31900
32000
32100 \CFigure 2.6
32200
32300 \JWe see from this figure that it is unnecessary to examine the moves marked
32400 x, because their values have no effect on the value of the game or on the
32500 correct move since the presence of the 9 is sufficient information at this
32600 point to show that the move leading to the vertex preceding it should not
32700 be chosen by the minimizing player.
32800
32900 The general situation is that it is unnecessary to examine further
33000 moves in a position once a move is found that refutes moving to the
33100 position in the first place. This idea is called the αβ-heuristic and
33200 expressed in the following recursive definitions:\.
33300
33400 \F1maxval p ← maxval1[p,-∞,∞],
33500
33600 maxval1[p,α,β] ← \F2if \F1ter p \F2then \F1max[α,min[β,imval p]]\;
33700 \F2else \F1maxval2[successors p,α,β],
33800
33900 maxval2[u,α,β] ← {minval1[\F2a \F1u,α,β]}[λw. \F2if \F1w = β \F2then\;
34000 \F1β \F2else \F1maxval2[\F2d \F1u,w,β]],
34100
34200 \F1minval p ← minval1[p,-∞,∞],
34300
34400 minval1[p,α,β] ← \F2if \F1ter p \F2then \F1max[α,min[β,imval p]]\;
34500 \F2else \F1minval2[successors p,α,β],
34600
34700 minval2[u,α,β] ← {maxval1[\F2a \F1u,α,β]}[λw. \F2if \F1w = α \F2then\;
34800 \F1α \F2else \F1minval2[\F2d \F1u,α,w].\F0
34900
35000 \J The reduction in number of positions examined given by the αβ algorithm
35100 over the simple minimax algorithm depends on the order in which the moves are
35200 examined. In the worst case, the moves happen to be examined in inverse order
35300 of merit in every position on the tree, i.e. the worst move first. In that case,
35400 there is no improvement over minimax. The best case is the one in which the
35500 best move in every position is examined first. If we look \F1n\F0 moves
35600 deep on a tree that has \F1k\F0 successors to each position, then minimax looks
35700 at \F1k\F6n\F0 positions while αβ looks at about only \F1k\F6n/2\F0. Thus a
35800 program that looks at 10\F64\F0 moves with αβ might have to look at 10\F68\F0
35900 with minimax. For this reason, game playing programs using αβ make a big
36000 effort to include as good as possible an ordering of moves into the \F1successors\F0
36100 function. When there is a deep tree search to be done, the way to make the
36200 ordering is with a short look-ahead to a much smaller depth than the main
36300 search. Still shorter look-aheads are used deeper in the tree, and beyond
36400 a certain depth, non-look-ahead ordering methods are of decreasing complexity.
36500
36600 A version of αβ incorporating optimistic and pessimistic evaluations
36700 of positions was first proposed by McCarthy about 1958. Edwards and Hart at
36800 M.I.T. about 1959 proved that the present version of αβ and calculated the
36900 improvement it gives over minimax. The first publication, however, was
37000 Brudno (1963). It is worth noting that αβ was not used in the early chess
37100 playing programs in spite of the fact that it is clearly used in any human
37200 play. Its non-use, therefore, represents a failure of self-observation.
37300 Very likely, there are a number of other algorithms used in human thought
37400 that we have not noticed an incorporated in our programs. To the extent
37500 that this is so, artificial intelligence will be \F1a posteriori\F0 obvious
37600 even though it is \F1a priori\F0 very difficult.\.